Precise Regression Benchmarking with Random Effects: Improving Mono Benchmark Results

Authors

  • Tomas Kalibera
  • Petr Tuma

Abstract

Benchmarking as a method of assessing software performance is known to suffer from random fluctuations that distort the observed performance. In this paper, we focus on the fluctuations caused by compilation. We show that the design of a benchmarking experiment must reflect the existence of the fluctuations if the performance observed during the experiment is to be representative of reality. We present a new statistical model of a benchmark experiment that reflects the presence of the fluctuations in compilation, execution and measurement. The model describes the observed performance and makes it possible to calculate the optimum dimensions of the experiment that yield the best precision within a given amount of time. Using a variety of benchmarks, we evaluate the model within the context of regression benchmarking. We show that the model significantly decreases the number of erroneously detected performance changes in regression benchmarking.
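The idea of choosing experiment dimensions for the best precision within a time budget can be illustrated with a minimal sketch. It assumes a textbook balanced two-level random-effects design (independent builds, repeated runs per build), which is not necessarily the model the paper develops. The function names, the variance components, and the cost parameters are all hypothetical and chosen only for illustration.

```python
def variance_of_mean(var_between, var_within, n_builds, n_runs):
    """Variance of the grand mean in a balanced two-level design.

    var_between: variance introduced by compilation (differs per build)
    var_within:  variance between runs of the same build
    Both are assumed known here; in practice they would be estimated.
    """
    return var_between / n_builds + var_within / (n_builds * n_runs)


def best_dimensions(var_between, var_within, cost_build, cost_run, budget):
    """Brute-force search for the (n_builds, n_runs) pair with the lowest
    variance of the mean, subject to
        n_builds * (cost_build + n_runs * cost_run) <= budget.
    Returns (variance, n_builds, n_runs), or None if nothing fits.
    """
    best = None
    max_builds = int(budget // (cost_build + cost_run))
    for n_builds in range(1, max_builds + 1):
        # For a fixed number of builds, more runs never hurts precision,
        # so take the largest run count that still fits the budget.
        n_runs = int((budget / n_builds - cost_build) // cost_run)
        if n_runs < 1:
            continue
        var = variance_of_mean(var_between, var_within, n_builds, n_runs)
        if best is None or var < best[0]:
            best = (var, n_builds, n_runs)
    return best


if __name__ == "__main__":
    # Hypothetical numbers: a build costs 4 time units, a run costs 1,
    # and 120 time units are available in total.
    print(best_dimensions(1.0, 4.0, cost_build=4, cost_run=1, budget=120))
```

With these made-up inputs the search balances the two sources of variation: spending the whole budget on runs of a single build would leave the compilation variance untouched, while rebuilding before every run would waste time on builds.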

Similar resources

Mono Regression Benchmarking

Regression benchmarking is a methodology for detecting performance changes in software by periodic benchmarking. Detecting performance regressions in particular helps to improve software quality, much as regression testing does, although regression testing focuses only on software functionality. To achieve an acceptable level of false alarms, regression benchmarking requires a statistically sound planning an...

An Efficiency Measurement and Benchmarking Model Based on Tobit Regression, GANN-DEA and PSOGA

The purpose of this study is to design a model based on Tobit regression, DEA, Artificial Neural Network, Genetic Algorithm and Particle Swarm Optimization to evaluate efficiency and to benchmark the efficient and inefficient units. This model has three stages, and it uses the data envelopment analysis combined model with neural network, optimized by genetic algorithm, to evaluate the ...

Quality Assurance in Performance: Evaluating Mono Benchmark Results

Performance is an important aspect of software quality. To prevent performance degradation during software development, performance can be monitored and software modifications that damage performance can be reverted or optimized. Regression benchmarking provides means for an automated monitoring of performance, yielding a list of software modifications potentially associated with performance ch...

Random forest versus logistic regression: a large-scale benchmark experiment

The Random Forest (RF) algorithm for regression and classification has considerably gained popularity since its introduction in 2001. Meanwhile, it has grown to a standard classification approach competing with logistic regression in many innovation-friendly scientific fields. In this context, we present a large scale benchmarking experiment based on 260 real datasets comparing the prediction p...

Practical benchmarking in DEA using artificial DMUs

Data envelopment analysis (DEA) is one of the most efficient tools for efficiency measurement which can be employed as a benchmarking method with multiple inputs and outputs. However, DEA does not provide any suggestions for improving efficient units, nor does it provide any benchmark or reference point for these efficient units. Impracticability of these benchmarks under environmental conditio...

Publication date: 2006